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An algorithm for symmetric sparse equation solutions on an unstructured grid is described. Efficient, sequen- 
tial sparse algorithms for degree-of- freedom reordering, supemodes. symbolic/numerical factorization, and for- 
ward backward solution phases are reviewed. Three sparse algorithms for the generation and assembly of symmet- 
ric systems of matrix equations are presented. The accuracy and numerical performance of the sequential version 
of the sparse algorithms are evaluated over the frequency range of interest in a three-dimensional aeroacoustics 
application. Results show that the solver solutions are accurate using a discretization of 12 points per wavelength. 
Results also show that the first assembly algorithm is impractical for high-frequency noise calculations. The sec- 
ond and third assembly algorithms ha\e nearly equal performance at low values of source frequencies, but at 
higher values of source frequencies the third algorithm saves ('PI* time and RAM. The CPU time and the RAM 
required by the second and third assembly algorithms are two orders of magnitude smaller than that required 
by the sparse equation solver. A sequential version of these sparse algorithms can. therefore, be conveniently in- 
corporated into a substructuring tor domain decomposition) formulation to achieve parallel computation, where 
different substructures are handled by different parallel processors. 


Nomenclature 

= global stiffness matrix and load vector 
without source effects 
= global stiffness matrix and loads vector 
w ith source effects 

= local element matrices for a rigid wail duct 
- one-dimensional arrays containing sparse 
matrix coefficients 

= element stiffness matrix and loads vector 
= complex matrix coefficients 
= contributions to the element stiffness 
matrix due to the exit plane and 
interior elements 
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diagonal and unit lower triangular matrices 

d, 

= 

value of field variable at local node / 

Id 

= 

element degree-of- freedom matrix 

(EM 

= 

discrete error vector 

E, . V,„ 

= 

error and three-dimensional basis functions 

F 

= 

fill in resulting from matrix factorization 

(Eh 

= 

/ th component of ( F ) 

( E i V ) | 

= 

( F] vector of length N 

/a 

— 

source frequency and free 
space wave number 

H. L. \\ 

— 

height, length, and width of 
three-dimensional duct 

|//.-l|. \ha\ 

= 

matrix of pointers for sparse assembly 

l HMu 


coefficient of the / th row and ith column 
of \HA\ 

l HMK. A/)| 

= 

[HA | an N x M matrix 

/». /. w 


height, length, and width of 
three-dimensional finite element 

(Ml 


array containing the number of nonzeroes 
per row 

(/C). (/£}. [IET] 


arrays of starting locations of nonzero 
coefficients 

i/n.i/n 


permutation and inverse permutation 
vectors 

m. UC) 

= 

array of row and column indexes for 



nonzero matrix coefficients 

_ i 

[JA I 

= 

V * 

array containing the column numbers 
of the nonzero off-diagonal 
matrix coefficients 

[JE | 

= 

array of element connectivities 

{JET) 


array of element numbers connected 
to each degree of freedom 
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Subscript* 

exit, s 
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number of nonzero coefficients in a sparse 
matrix. N + N\ 

maximum number of elements connected 
to a degree of freedom 
array of element numbers connected to a 
degree of freedom 

array containing the number of elements 
connected to a degree of freedom 
master degree-of-freedom array 
maximum number of nonzero coefficients 
per row 

number of unknowns in the finite element 
discretization 

number of finite elements and degrees of 
freedom per element 

number of fill ins during factorization of a 
matrix 

total number of transverse, span wise, and 
axial nodes 

number of nonzero, off-diagonal 
coefficients before factorization 
number of nonzero, off-diagonal 
coefficients after factorization 
acoustic pressure field 
acoustic pressure at local node n: and 
source pressure 

relative error norm and uniform fiou speed 

surface and volume of a finite element 

nonzero value that was modined during 

matrix factorization 

Cartesian coordinates 

transverse, span wise, and axial locations 

of grid lines 

dimensionless exit admittance 
derivative of the acoustic pressure 
normal to a surface 

dimensionless wall and exit impedance 
global vector of unknowns 
local vectors ot unknowns 
intermediate vectors for forward and 
backward substitution 
vector components 
gradient vector and Laplace operator 
vector dot product 


exit and source plane index 
factored and reordered matrix 
forward and backward substitution 
row and column index of a matrix 


truss element number 
grid line locator for three-dimensional duct 
matrix or vector transpose 
complex conjugate 

I. Introduction 

T HREE-DIMENSIONAL aeroacoustics codes that can accu- 
rately predict the noise radiated from commercial aircraft are 
needed. ! Currently, noise prediction codes require the use of a linear 
equation solver before radiated noise can be predicted. .An optimizer 
must then run the noise predictive code on a digital computer hun- 
dreds of times to achieve an aircraft design with a minimal noise 
radiation signature. 

Currently, industry and government aircraft noise predictive 
codes are either two-dimensional or treat only axisymmetric noise 
signatures. 1 When the volumes arc three-dimensional, the currently 
used equation solvers require an excessive amount of CPU time 
and RAM for their assembly and solution. This excessive CPU time 
and computer storage restricts aircraft noise prediction codes to 
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low-frequency sound sources in two-dimensional or axisymmetrr 
environments. 

Sparse equation solving technologies^ 14 have been developed 
and are well documented for several engineering applications, -n : 
the computational advantage of sparse solver technology ovtr 
more conventional technologies (such as band or skyline soi 
has been demonstrated. In addition, for practical engineering id- 
eations. system matrix equations must be developed for an unstruc- 
tured grid to which boundary conditions are often difficult to app:\ 
The finite element method is the simplest for generating the sxsterr. 
matrix on an unstructured grid. 

Only recently have sparse solver technologies been applied to 
aeroacoustics. 1 ‘ 1? In Ref. 1. several direct and iterative equation 
solvers were evaluated to determine their applicability to two- 
dimensional duct aeroacoustics computations with the direct sparse 
solver emerging as the most promising. In Ref. 15. sparse mv . - 
equation solving methodology was extended to three-dimens i 
acoustically lined ducts. However, the work presented in Re: .5 
adopted the assembly strategy that is currently available in the lit- 
erature for assembling system sparse matrix equations. This simple 
but inefficient assembly strategy precludes the use of sparse sobers 
for three-dimensional aeroacoustic computations. 15 

Most, if not all. major codes for analysis and optimal design al- 
low users to select either iterative or direct equation solver*. For 
nacelle aeroacoustics computations, iterative solvers are not a< ro- 
bust as direct solvers because the nacelle equation system is poor K 
conditioned. 1 Iterative solution methods, when applied to system' 
poorly conditioned equations, have the disadvantage that they do . 
converge, or they converge very slowly. A further disadvantage o: 
applying iterative solution methods to solve the nacelle equation *> 
tem is that the nacelle equation system often contains multiple right- 
hand sides, heratix c methods are not as efficient as direct method* on 
equation systems with multiple right-hand sides because the equa- 
tion system must be reformed and resolved for each right-hand *ide. 

The long-term objective of this research is to acquire the capabil- 
ity to design quiet aircraft in a fully three-dimensional aeroacoustic 
environment using direct sparse solver technologies and the finite 
dement methodogy. The current paper has iwo objectives. The 
objective is to bridge the gap between the aeroacoustic ian* < v- 
may not have a comprehensive know ledge of sparse assembh and 
equation solver technologies) and members of the sparse research 
community (who may not have comprehensive knowledge of fi- 
nite element analysis and aeroacoustics). The second objective to 
present efficient algorithms for assembling sparse matrix equations. 

Section II describes three sparse assembly algorithms for gener- 
ating systems of sparse linear equations. Section III describe* the 
template that is used to develop a complete, unstructured grid, fi- 
nite element code, that is. equation reordering, symbolic/numencal 
factorization, supemode s/loop unrolling, and forward/back ware ' - 
lution phases. Section IV presents a detailed formulation of the *. 
ement stiffness matrices that will be assembled using the sparse as- 
sembly algorithms to form the system matrix fora three-dimensional 
duct aeroacoustics application. Finally. Sec. V discusses the accu- 
racy and numerical performance of the developed algorithms over 
the frequency range of interest for a three-dimensional aeroacoustics 
application. Note that although the sparse algorithms presented as- 
sume that the system matix equation is symmetric, these algorithms 
are easily extendible to nonsymmeinc systems of equation*. The 
algorithms can also be conveniently incorporated into a substruc- 
tunng (or domain decomposition) formulation to take advantage 
parallel computation to further reduce CPU time and RAM. 

II. Sparse Assembly Algorithms 
for Symmetric Systems 

Figure 1 is a two-dimensional truss (or rod) structure assembled 
from individual truss elements labeled (1), (2),..., (13) that are 
interconnected at eight nodes labeled 1 , 2, .... 8. An element ( e ) of 
the structure is assumed to possess only two points of connection, 
and the external loads are assumed to be applied at the nodes of the 
truss elements. Only a single degree of freedom (DOF) at each node 
is assumed. To further simplify discussions, it is assumed that, by a 
separate calculation, the dement stiffness matrix and external load 
vector for the truss element (e) are known and expressed as 


1 >:/ j 


l^’j 


Under the assumption of Eq. (1), the 13 truss elements (Fig. 1) 
may be assembled using the rules of finite element assembly 16 to 
obtain the system matrix equation 
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iO(,V)| = 



Although only a single DOF \ ? ns assumed at node / . the di$~ 
^'Mon to to How is easily extended to a DOF per node by extending 
: - coefficients in [A]. that is. A'jj. loq x q submatnces. The rules 
’‘tutrix algebra would then oe applied to each q :< q submatrix as 
- vv ere a scalar. 


Sparse Data Formats for the System Matrix 

F°r the sake of brevity, m the discussions to follow it will be 
turned that the element stiffness matrix is symmetric so that 



K'L - n <« 


Under the assumptions of Eq. (6), the system matrix [A\ is also 
symmetric and can be written in the form 

[A<\..V)] 
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The sparse descriptions of any symmetric system matrix | A | (see 


Eq..?)] is: 

fully described by the four one-dimensional 

vectors 

!/A(.Vi| = 

14.2.3. 2. I. 1 . 0 . 0) r 


{JAiM)\ : 

= (2. 4. 5. 7. 3. 5. 5. 6 . 8 . 5. 7. 6 . S) r 

( 8 ) 

(AD(.V)l = 

= [An. At;, Ayy, A 44 , A 55 , A 66 * A—. Aksi' 
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B. Application of Boundary Conditions 

In most engineering applications, the field variable at several 
boundary nodes may require constraints to satisfy a Dinchiet bound- 
ary condition of the form 


{*)/ =di (ID 

where d f is the specified value of the field variable at node I . 
Dinchiet boundary’ conditions may be applied at the element or 
system level. The impact of applying Dirichiet boundary conditions 
on the system matrix equation is identical whether applied at the 
element or system level. We will show the relatively easy process 
of applying Dirichiet boundary conditions at the element level and 
their impact on the system matrix equation [Eq. (2)]. 

The process for inserting the Dirichiet boundary condition. 
(O)/ is as follows: 

\ ) The coiumn of [A l<M ] corresponding to the Ah DOF is multi- 
plied by d,, and the result is subtracted from [P' 1 }. 


5 



2) The column corresponding to the /th DOF in IA { *'] is made 
^zero. 

. 3) The; row corresponding to the /th DOF in [A iel ] is made zero. 

4) The modified element matrix and the modified element load 
vector are assembled. 

5) [A)u is made equal to unity, and {F}, is made equal to d } . 
Thus, applying Dirichlet boundary conditions to the system ma- 
trix equation modifies Eq. (2 ) to 

M](<D) = {F) 

The numerical values of the coefficients in the modified system 
matrix 1.4] remain unchanged from those in 1 A), except for a few 
that are made zero during the application of the Dirichlet boundary 
conditions. Therefore, we w ill illustrate application of the assembly 
algorithms to the nonzero pattern of [A]. 

C. Sparse Assembly Algorithms 

Three symmetric sparse assembly algorithms will be explained 
in this section. The purpose of each assembly algorithm is to gen- 
erate the system loads vector {F} and the four vectors denned by 
Eqs. ( S— 10). which correspond to [A). The assembly algorithms 
are discussed starting w uh the simplest and proceeding to the most 
complex. 

; Aii'onthm I 

The mam ideas of this algorithm can be summarized by the fol- 
low mg computational tasks: 

1 5 Find how manv and which elements are connected to each 
DOF. 

a i Input data for ,Y..V£..\7\ and the elements connectivity isee 
Fig. 1). 

hi Compute the number of elements associated with each DOF. 
and store this information into the one-dimensional integer \ector 
|A/F<Y)|. 

c> Find the element numbers associated w ith each DOF. and 
store this information into the two-dimensional integer matrix 
;UA/(.\..V/F)|. 

2 1 Retrieve the stiffness matrix attached to each DOF. perform 
sparse matrix assembly one row at a time, and extract the four one- 
dimensional vectors required for the sparse equation solver. 


r M^nruhni 2 

The nonzero patterns of the symmetrical matrix \A] (see Eq. i7)J 
lor the two-dimensional truss example problem (Fig. 1 ) can be com- 
pletely described by the two one-dimensional integer vectors 

;/A*i M )) = 1. 1.4.4. 1. 1.2. 1. 5. 4. 2. ?. 2. 3. 5. 6. 3. 3. S. 6J r 

(13) 

!iC< A/)| = (7.7. 1.7. 4. 4. 2. 2. 5. 5. 5. 3. 3. 5. 5. 6. 6.6. 8. S.8j r 

(14) 

and the following integer matrix: 



The two one-dimensional integer vectors in Eqs. (13) and ( 14) are 
constructed by cycling through each element (e ) in increasing order 
and then determining the row' and column index of each nonzero 
coefficient from the connectivity array for each element (Fig. 1). 
Note that the matrix [HA] contains locations (or pointers) that are 
used to refer to vectors {//?} and {/C}. For example, the values 
of [//A] 4 , =4, [HA] 42 = 5. and [HA] a3 = 1 1 indicate that row 4 of £ 


mairix [A] will have these nonzero terms. The exact locations (row 
and column numbers » of those three nonzero terms in [A] can be 
referred to as 
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Thus, the three nonzero terms of the fourth row of [A] are located at 
row 4. column 7: row 4. column 4; and row’ 4. column 5. respectively 
[see Eq. (7)). 

The integer matrix [HA] and system matrix [A] can be alterna- 
tively stored as one-dimensional vectors: 


\ha<\1)\ = (2.3.6. '.9.8. 12. 14. 13. 15. 18. 19. 

4.5. 11. 10. 16. 17.21. 1.20} r <]7 

|a(A/)) = {A\V. A ''\ . .4; 2 . A\:. Ay.. A'*, 4‘g. A™. A'[. 

At*- .A'. 1 /', a 1 '. . A\'\ 4'.g. A'?,. Ay. A\'"\ 4kk. A„. 4-)' 


The main ideas of algorithm 2 can be summarized by the following 
computational tasks: 

1 ) After initializing \fw} to a zero vector, process all elements un 
ascending order) to obtain the integer vectors {1R} and (7C1 while 
assembling [A] into ,j). 

2) Separate the diagonal and nonzero off-diagonal terms of j A 
from {a) and store this information in (AD) and (AA r ). Separate th 
diagonal and off-diagonal terms in {//?) and {7C}. and compute \JA 
and {/A j. 


3. Ali?onriim 3 

The nonzero palters of the symmetrical system matnx [A] can 
be completely described by {JA) and the follow ing one-dimensional 
integer vector: 

(/O.V-s-l ; = (1.5.7. 10. 12. 13. 14. 14. 14) r (|9> 


;/4|, = (/Cl,.,-(/C|, (20 

and the variable \ ] can be conveniently computed as 

M = |/CK.,-{/C}, 

The element connectivity information for the two-dimensional 
truss sample problem (Fig. 1) is full) described by the eiemeni- 
DOF matrix 

*10 0 0 0 0 10* 

0 0 0 10 0 10 

10 0 10 0 0 0 

110 0 0 0 0 0 

10 0 0 10 0 0 


[£(.V£. , Y i ] : 


0 0 0 1 10 0 0 

0 1 10 0 0 0 0 

0 10 0 10 0 0 
0 0 I 0 1 0 0 0 

0 0 0 0 1 10 0 

0 0 10 0 1 0 0 

0 0 10 0 0 0 1 

0 0 0 0 0 10 1 


*ach row of [£] contains exactly two nonzeros because each element 
tas two points of connection, or nodes, to the structure. Thus, [£]// i* 
lonzero only if node J is a node for element / . For example, the first 
ow of [£] contains a unit value only in columns 1 and 7. indicating 
hat the first element of the truss is connected to nodes 1 and 7 only 



(rig. l i ne concept oi an eismenM^ur matrix is easily extended 
to q DOF per node by extending each of the unity* coefficients in 
[£] to a q x q identity matrix. 

• To minimize the RAM, it is convenient to describe the element- 
DOF matrix [£] by the two one-dimensional vectors 

[IE(NE 4 1)} = {1, 3,5, 7, 9, 11, 13, 15, 17, 19,21.23, 25.27} r 

(23) 

[JE(NExNP)} = {7. 1.7. 4, 4. 1. 1,2. 1.5, 

4. 5. 2. 3. 5. 2. 5. 3. 5. 6. 6. 3. 3. 8, 6. 8} r (24) 

and the transpose of the element-DOF matrix <[£] r ) by the follow- 
ing two one-dimensional vectors: 

{IET(X 4- 1)J = {1.5.8. 12. 15. 20. 23. 25. 27} r (251 

{JETiXE X NP\) = {1.3. 4.5. 4. 7, 8.7,9. 11. 12.2.3. 

6.5.6. 8. 9. 10. 10. 11. 13. 1.2. 12. 13) r (26) 

The main ideas of algorithm 3 can be summarized by the following 
computational tasks: 

1 ) Assume that [/£). {/£}, {/£TJ, and {JET) have already been 
defined from the connectivity information (see Fig. 1 ). 

a) Compute j/C} and {/4} (symbolic assembly phase!. 

b) Compute {74} from Eq. (20). 

2) Assume that vectors [IA \ and \JA) have already been dehned 
from the symbolic assembly (task 1 ). Compute [AN] and {AD} from 
[A' v '\ (numerical assembly phase!. 

Ill, Sparse Algorithms for Solving 
Symmetrical Equations 

In this section, the major tasks involved in solving sparse svs- 
:ems of linear equations are briefly explained. The success of the 
sparse solver is due to improved technologies u.e.. equation re- 
ordering, matrix decomposition, supemodes and k>op unrolling, tor- 
ward backward solution phases) and bookkeeping strategies ideal 
for implementation on a digital computer. More detailed information 
on improved technologies can be obtained from Refs. 2-14. 


A. Sparse Reordering Alfiorithms 

After imposing the boundary conditions, the modified stiffness 
matrix |.4| can be obtained from [.4| as indicated in the discussions 
efore Eq. ( 12). Equation ( 12) should never be solved directly To 


funner simplify the discussions, we 
the following numerical values: 


110 7 

7 112 



5 0 

3 0 


sill assume that matrix [ A 1 has 

4 0 5 3 ' 

0 2 0 0 

66 0 0 0 

(27) 

0 11 1 0 

0 1 88 0 

0 0 0 44 


Thus, in this case A = 6 and V 1 = 6. Dunng the factorization phase, 
many of the zero- value terms appearing in Eq. (27 ) may become 
nonzero. For maximum efficiency of storage and solution time, the 
equations are reordered so that the number of nonzero terms that 
occur dunng factorization are minimized. These extra nonzero terms 
created dunng the factonzation of [A] are referred to as till ins and 
are denoted by the symbols F in the following equation: 


[A F (\ J . X)] 


X X X 0 X X 

X F X F F 

X F F F 

X X F 
X F 
X 


(2$ l 


in Lq. (28), onelus eight extra (or new; nonzero nu ms. Asa result, 

NF = 8 (29) 

N2 = AH 4 NF = 648 = 14 (30) 

In general, the number of nonzero coefficients in the upper triangular 
pan of [A] after factorization (N 2) is much larger than those before 
factorization (AM). 

The purpose of reordenng algonthms [multiple minimum degrees 
(MMD), nested dissection, or METIS algorithms) is to rearrange 
the nonzero terms of [A], defined in Eq. (27), to different locations 
so that ;V 2 is minimized. 5 17 “ :: ' For example, applying the MMD 
reordenng algorithm to [A] will result in the following permutation 
and inverse permutation sectors: 

{/P(.V)J = {5.6.3. 1, 4. 2} r . \l\'(N)\ = {4. 6. 3. 5. 1.2) r 

(31 i 


With the permutation array {IP}, the matrix [A] in Eq. (27) can be 
transformed into 


[AtfGV..VV) 


11 0 0 1 0 2 

0 44 0 0 3 0 

0 0 66 0 4 0 

1 0 0 88 5 0 

0 3 4 5 110 7 

2 0 0 0 7 112 


(32) 


Now. if one factorizes [A*], there will be only one fill in that 
occurs, as follows: 


| A/i/:{ A. A)] 


"X 0 0X0 X* 

X 0 0 X 0 

X 0 X 0 

X Y F 

X Y 
X 


(33 > 


B. Sparse Symbolic Factorization 

The reordered matnx [ A* J can be described by the following four 
one-dimensional vectors: 


[IA(S - li) = {1.3. 4. 5. 6. 7. 7| r 


{.Ml.Yl.i = (4. 6. 5. 5. 5. 6}' 1 34 > 


{AD(X)) = {11.44.66.88. 110. 112)' 

{AA'f.Y 1 ij = {1.2. 3.4.5. 7| r <35> 

In this example. .V =6 and .V 1 = 6. Before performing the numer- 
ical factorization, it is necessary to go through the sparse symbolic 
factonzation. so that the following hold true: 

1) The nonzero pattern of [A/?y] can be determined (including 
the locations of fill ins ). 

2) The value of N2 can be determined so that adequate com- 
puter memory can be allocated for the subsequent sparse numerical 
factorization phase. 

On completion of the sparse symbolic factonzation phase, the 
nonzero patterns of [ Arf] are completely known, and the modified 
versions of Eqs. (34) and i?5) for the factored matnx [ A/?f j can be 
computed as 


{/A( /V -r 

1)1 = (1.3. 4.5. 7.8. S!' 


{/A(A f 2 

)! = 14. 6.5. 5.5.6. 6| r 

(36i 

N 2 = 

V 1 - ,V£ = 6 + 1 = 7 

(37) 
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Efficient sparse symbolic factonzation algonthms and detailed 
FORTRAN coding can be found elsewhere. 1 


C. Finding Supemodes 

To understand the concept of a supemode tor master node), notice 
that, in Eq. (33). rows 2-3 and 4-5 have the same nonzero patterns. 
That is. the nonzero terms in rows 2-3 correspond to the same 
column numbers. Equation (33) can be used to define a master DOF 
vector 

[MS(N)} = (1.2. 0,2.0. 1} T (38) 

The master DOF vector {MS} is based on the assumed system matrix 
[ Arf ] defined in Eq. (33). Once Eq. (3S i has been defined, effective 
loop-unrolling techniques 2 23 can be used to improve computational 
speed during the sparse numerical facionzation phase. 

D. Sparse Numerical Factorization Phase 

The strategies employed in this phase are quite similar to the ones 
used during the sparse symbolic facionzation phase and have been 
well documented in the literature. 5 - 24 The reordered system matrix 
[ A H } can be decomposed or factorized as 

(^] = [ar][n r (39i 




Source none 
pressure. p t 


Fig. 2 I nree-dimerisioaal duct and coordinate system. 

A. Mathematical Formulation 

The mathematical formulation of the duct acoustics problem 
(Fig. 2) does not lead to a boundary value problem that is for- 
malh self-adjoint and will not lead to a symmetric system ma- 
trix when airflow is considered. Thus, the analysis in the foregoing 
discussion does not allow for airflow because the current paper fo- 
cuses on symmetric systems. With zero airflow in the duct = 0 j_ 
the mathematical problem is to find the solution to Helmholtz'^ 
equation 15 

V 2 p + k~p = 0 (44) 

Along the source plane of the duct t: = 0». the boundary condition 
is given in term of a Dirichiet boundary condition: 

p ~ p, (45i 

The wall boundary condition is 


E. Solution to the System Matrix Equation 

The solution to the system matrix equation [Eq. (12)] is obtained 
in three phases: 

1) In the first phase (forward solution phase), an intermediate 
solution vector \Q>ff I is computed from the solution of the matrix 
equation 


= — IK — (-40 f 

dn 

At the duct termination iz — L ). the ratio of acoustic pressure to the 
axial component of acoustic panicle velocity is proponionai to the 
known dimensionless exit impedance. When expressed in terms of 
the acoustic pressure, this boundary condition is 


[£}\<*>Ff} = \Fh) (42) 

2) In the second phase (backward solution phase), a vector {$bs} 
is computed from the matnx equation 

mC] T {*BB) = 14 >ff) (43) 

3) In the third phase (back transformation phase ). the vector ( <&bb } 
is transformed back to the onginai unknown vector {<J>} by utilizing 
the inverse permutation vector {/V}. 

IV. Three-Dimensional Aeroacoustics Application 

The developed algorithm will be exercised to study the propa- 
gation of acoustic pressure waves in a three-dimensional duct lined 
with sound absorbing materials ( acoustic liners ) as depicted in Fig. 2. 
The duct is spanned by axial coordinate r. transverse coordinate .v, 
and span wise coordinate y . The source plane is located at r = 0. and 
the source plane acoustic pressure p , is assumed known. At the exit 
plane, the dimensionless exit acoustic impedance £«» is assumed 
known. In the duct, air is flowing-along the positive c axis at a sub- 
sonic speed of w 0 , and the duct has acoustic liners along its upper, 
lower, and two sidewalls. The duct walls are assumed to be locally 
reacting so that the sound absorbing properties of the acoustic liners 
results from the dimensionless wall impedance £ that is assumed 
known. The sound source pressure, dimensionless exit impedance, 
and dimensionless wall impedance are assumed functions of posi- 
tion along their respective boundaries. 


OR Sexit 

Equations (44-47) form a well-posed boundary value problem 
for which exact solutions for the acoustic pressure field are gener- 
ally not known. A solution for the acoustic pressure field satisfying 
this boundary value problem is required to predict and reduce the 
radiated noise. An approximate solution for the acoustic pressure 
field can be obtained using numerical techniques such as the finite 
element method. 

B. Finite Element Model 

The approximate solution for the sound field in the duct is ob- 
tained by subdividing the duct and representing the acoustic field 
within each subdivision by relatively simple functions. Because the 
duct of interest is a rectangular prism, the computational domain is 
divided into a number of smaller rectangular prisms (or elements) 
as shown in Fig. 3. These elements are considered interconnected at 
joints called nodes. The most widely used method for locating the 
nodes in the discretization is to divide the physical volume into NX , 
and NZ grid lines in the a\ y . and r directions, respectively, as 
shown in Fig. 3. Each node of an element can be located by iden- 
tifying an ordered triplet. >v, z k ). Similarly, each element in 
the assemblage can be identified by an ordered triplet of integers 
(/, J. K). A typical rectangular pnsm element (/, y, K) is shown 
in Fig. 4. Each element consists of eight local node numbers labeled 




Fig. 3 Three-dimensional finite element discretization. 



The linear combination JEq. (49)] comprises a complete set of basis 
functions. 

For a typical element (/, /. K ). contributions to the minimization 
of the held error function due to local node m over the computational 
volume V are 

J E r N m dV = j^lV : p + k : p]N m dV (51) 

The second derivative terms in Eq. (51) are reduced to first deriva- 
tives using Green's second identity 


dV = JT [-{ V}p • { V}W m -j- k z pN m ] dV' - j d5 


A 



Fig. 4 Typical three-dimensional dement and local node numbering 
system. 


i. 2 8. Each element is considered to have a dimension of h . 

and / in the a . y. and : directions, respectively, as shown. 

C. Element Stillness Matrix 

GalerkinY finite element method is used to compute the element 
stiffness matrix. The neld error function is defined as 

£ = V~{) -f k~ p (48) 

Within each element, p is represented as a linear combination of 
eight functions. .Y ; . .Y : 


ni = s 

p = y"! V (49) 

m = 1 


Elimination of the second derivative terms from the volume inte 
gral in Eq. (51) is required so that the linear basis functions N m 
can be used. Elimination of the second derivative terms from the 
volume integral also has the advantage that ail impedance boundary 
conditions can be incorporated into the surface integral of Eq. (52). 
This allows a choice of basis functions that do not have to satisfy 
explicitly any impedance boundary conditions. The contribution to 
the surface integral 


L 


dp 

— N~ d5 

tin 


(53) 


is identically zero for all elements except those that lie along 
an impedance boundary'. Substituting ihe exit boundary condition 
[Eq. i-Tij mto the surface integral in Eq. t53 > gives 



[%N. 

„ dS = -ik 

P 

-r~: \\ dS 


(54) 


JS l * n 

Js 



along 

the exit boundary. 

whereas for elements that 

lie aloiii 

: the 

upper. 

lower, and sidewalls of the duct 





f — S 

dS = -ik j 

f - V„, dS 


(55) 


Js < ln 

J 

.t c 




The contribution to the minimization of the field error for each 
element, when collected for each of the eight local nodes m , is 
expressed in matrix form as 


/ 

/ 

/ 


E,N,dV 


E r N Z iiV 


ErNtdV 


= [A i J «']{<*> ' J « 


(56 1 


V; = (' _ s)(t)(' _ ?) "’“T. 

•'■‘■(iK'-jK 1 -?) 


In Eq. 56). JK] ) is an 8 x 1 column vector for each element 
containing the unknown acoustic pressures at the eight loeui nodes 
of the element 

{ c t > ; ' c ’)‘ = {p\- Pz- py- Pi- Pz- Pf p-. P»\ (57) 

The element matrix [4 {/ y A l ] is an 8 x 8 complex symmetric 
matrix for each element (/, J . K). In the special case of a hard wall 
duct i ; — oc ). 




[V). 

m+iB]. 


K = I.VZ - I)| 
K = tSZ - 1 ) j 


3 


( 58 ) 


Here. \V] represents the contribution to due to the element 

volume \ T . whereas f B ] represents the contributions due to the exit 
plane boundan The matrices [P] and [5] are symmetric, and their 
coefficients ha\e been computed explicitly: 


k 2 whl 
VP] = 

11 ->th 


4 : i 


vol hi wh 

sbh 36 u 36 / 


4 2 14 8 


f? — fiexix VX f . Xj ) -j- ficxil (' / . }'j — J ) 

Arxit (A / -► 1 « }'j + 1 ^ 4" (*V/ • Vy — I ) 

/a — (A / . \'J ) + (.Y / . Vy -r 1 ) 

~ A:ui<-V/ * | . Vy * I ) + 3^ CXI , (A / . Vy * 1 ) 
./? = 3/i exit (■V/. >v) + 9/J exit (-V/. \ / _ i ) 

— 3 A rx:ti-V/ _ I . Vy _ J ) 4 &\nU/ . > 7 - i ) 

= ftrxuLl / . Vy ) 4- 3/*e\,i(-V/ . Vy j ) 

— 3/v v ,U/ * i , V/ | ) 4* A?*u(-V/ . Vy . j ) 
/- = ftx.tU/ . Vy ) 4- 3/* eMl (.Y, . Vy - j ) 


2 4 4 4 4 8 


“ 9ft. XI ,u i - 1 . y./ ♦ i ) 4- L'XIl (.V/.Vy.,) 


4 2-2-4 2 1-1 

2 4 -4 -2 1 2-2 

-2 -4 4 2 -1-2 2 

-4-2 2 4 -2 -1 1 


1 2 - 2-1 
- 1-2 2 1 


4 2-2-4 

2 4-4-2 


J ' = Vy ) 4“ Pcmi (A/. Vy * j ) 

— 3[0 ovl ,(V/ ^ 1 , Vy * j) 4- ft exit 1 A / . X j _ ] ) ] 
= 3/J CX |,(A / . Vy ) 4- ftrxitlA / . Vy _ | ) 

- 3/J, KlT l.Y/ . j. Vy - | ) 4 9/)*, xlI (A/ . Vy _ , ) 


in which 


2 -4 


4 -4 -2 2 2 -2 -1 1 


"i _ 


2 2 4 -4 -1 1 2 

2-2-4 4 1-1-2 


4 -4 


-1 -4 


_ *> "> -> 


4 -4 


-4 -2 -1 -2 

-2 -4 -2 -1 

-I -2 -4 -2 


-2 -4 


-4 -2 -1 -2 4 

-2 -4 -2 -I 2 

-2 — \ -2 l 


0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 


151= -K 


0 0 0 0 0 0 0 0 

0 0 0 0 j\ J: j\ u 

0 0 0 0 /: U ft h 

o o o o h ft fi h 

0 0 0 0 U h h h 


V. Results and Discussion 

The three-dimensional rigid wall acoustic element has been cou- 
pled with the sparse assembh and equation solver algorithms to 
pro\ ide assembly and solver statistics for a three-dimensional duct 
aeroacoustics application. Computations presented in this paper 
were run on a single processor with double-precision < 64-bit i arith- 
metic on an ORIGIN 2000 computer platform. The sparse equation 
sober used MN1D reordering. Computations are presented for a 
uniform grid and a geometry identical to that of the Langley Flow 
Impedance Tube. This three-dimensional duct has a square cross 
section 0.050S m in width (U = H =0.0508 mi and 0.812 m in 
length [L =0.812 m). A more detailed description of the duct is 
given in Ref. 15. All calculations were performed ai standard at- 
mospheric conditions without flow, and the source frequency was 
chosen to span the full range of frequencies currently of interest in 
duct liner research. The sound was chosen as a plane wave ( p, = 1 ). 
and the dimensionless exit impedance was chosen as unity ( £ exj , = 1 h 
This exit impedance will simulate a nonreflecting termination for 
the plane wave source. 

Table 1 presents CPU statistics ( in seconds ) for each of the three 
assembly algorithms and the sparse equation solver as a function of 
the source frequency /. in kilohertz. The CPU time for ihe solver 
t column 9 1 is that required to obtain the solution vector after the sys- 
tem matrix was assembled. Note that before obtaining the solution 
vector, the system matrices obtained from each assembly algorithm 
were compared to each other. Each assembly algorithm assembled 
the identical system matrix as expected. Also included in Table 1 are 
the number of grid lines NX. NY. and NZ and the matrix order .V that 
w ere used to perform the computations at each frequency. Here we 
have used the generally accepted rule that 12 points per wavelength 
is required to resolve a cut-on mode in each coordinate direction. To 
establish the accuracy of the solver solutions, the relative error norm 
iRelern. computed from the solver solution vector, was tabulated in 
the final column of Table 1. The relative error norm* is defined as 


f\ — 9 /? exiI ( A , Vy ) 3 / 3 ex „(A/ , Vy + j ) 

4- ficxv. [ - X - i ♦ }'J - 1 ) + 3 Arx,, (a / , Vy + \ ) 
fz = 3/? exi! (A . Vy ) - 1 - ftcxn Li / , Vy + j)] 


Reierr = 


[EN r X (£N} r 

{E] m x \F] T 


yy^i 


) * ficxu (A/ . Vy 1 ) 


{HV} = [yi]{4>}-{F} 


/ 

sx 

vr 

NZ 

N 

Algorithm i 

Algorithm 2 

Algorithm 3 

Solver 

Relerr 

4.00 

6 

6 

114 

4,104 

4920 

034 

0.22 

6.00 

6.5 x 10-’ 5 

7.00 

12 

12 

200 

28.800 

106.80 

230 

1.75 

22.80 

1.8 x 10 -12 

11.00 

18 

18 

313 

101.412 

U23.80 

9.05 

228 

487.80 

7.5 x 10- 12 

14.00 

24 

24 

399 

229.824 

5.520.60 

20.74 

14.24 

3.120.00 

3.4 x lO -11 

17.00 

30 

30 

484 

435.600 

19.488.00 

39.78 

26.94 

10.440.00 

3.2 x 10- 11 

21.00 

36 

36 

599 

776.304 

N/A 

73.81 

48.35 

N/A 

N/A 


Table 2 RAM statistics (in megabytes) for the sparse algorithms 


/ 

N 

.VI 

Algorithm 2 

Algorithm 3 

Solver 

4.00 

4.104 

41.468 

0.47 

0.46 

4.00 

7.00 

28.800 

331.244 

21.00 

11.00 

80.00 

11.00 

101.412 

1.216.118 

72.00 

37.00 

640.00 

14.00 

229.824 

2.812.838 

165.00 

83.00 

2140.00 

17.00 

435.600 

5.396.600 

3 i 7.00 

158.00 

8.100.00 

21.00 

776.304 

9.696.158 

551.00 

283.00 

N/A 


Tabular results at 21 kHz are not presented for assembly algorithm 1 
and the sparse equation solver because of the excessive CPU time 
required by these two algorithms. 

Although algorithm I is extremely simple, its performance is 
extremely slow (Tubie i ). Note that algorithm 1 is 145 times slower 
than the other two algorithms at a frequency of 4 kHz and more 
than 490 times slower at 17 kHz. Tabular results also show that the 
CPU time required to assemble the system matrix using algorithm 1 
exceeds that required to obtain the solution vector by 904K s (or 
S7G ) at 17 kHz. At low frequencies, algorithm 2 is only slightly 
slower than algorithm 3. but as the frequency increases to 17 kHz. 
algorithm 3 is 32G faster than algorithm 2. Generally, the higher 
the frequency, the better the performance of algorithm 3. relative to 
that of algorithm 2. Furthermore, in using algorithm 2. the user has 
to guess the maximum number of nonzero terms per row i \/Z) to 
allocate the RAM for the matrix |//A 1. Also, the CPU times required 
to assemble the system matrix using algorithm 2 or algorithm 3 are 
both more than two orders of magnitude less than the time required 
to obtain the solution vector. Finally. Relerr is small, indicating that 
the solver solution is accurate. 

Table 2 shows the RAM (in megabytes) for algorithm 2. algo- 
rithm 3. and the sparse equation solver. RAM statistics for algo- 
rithm 1 were not tabulated because its performance was extremely 
slow when compared to algorithm 2 and algorithm 3 (as shown in 
Table 1 ). Values of the variables ;V and .V 1 are also given in Table 2. 
The results show that the number of off-diagonal nonzero coeffi- 
cients (V l ) is an order of magnitude larger than .V. Table 2 also 
shows that algorithm 3 requires less memory than algorithm 2 be- 
cause algorithm 2 must allocate RAM for storing vectors \IR). |JC). 
and [//A] [see Eqs. < 13-15)]. Note also that memory required by the 
sparse equation solver is substantially larger than that required for 
assembly algorithm 2 or algorithm 3. This is further verification that 
most of the RAM allocated is used during matrix factorization. Pre- 
liminary results from tests conducted by the authors have suggested 
that the performance of the sparse equation solver may improve if 
the solver were to use METIS instead of MMD reordering. Forex- 
ample. at 7 kHz the number of nonzeros after factorization i ,V2 1 was 
reduced from 4.736.991 with MMD reordering to only 4.3" , 6.496 
when the METIS reordering algorithm was used. 

VI. Conclusions 

A template for symmetric sparse equation assembly and solutions 
on an unstructured grid has been presented. The accuracy and nu- 
merical performance of the sparse algorithms have been evaluated 
over the frequency range of interest in a three-dimensional aeroa- 
coustics application. Based on the results of this study, the following 
conclusions are drawn: 

1 ) Assembly aigonrhm 1 is impractical for sysiem matrix assem- 
bly at high values of source frequency. It requires up to 87 r r more 
CPU time to assemble the system matrix than the sparse equation 
solver requires to obtain the solution vector. 


2) Assembly algorithms 2 and 3 have nearly equal performances 
at low values of source frequency, but algorithm 3 gives savings 
in both CPU time (32 Cc) and RAM (507c) at the higher values of 
source frequency. 

3) Error norm statistics show’ that the sparse equation solver com- 
putes accurate acoustic solutions over the frequency range of interest 
for the three-dimensionai aeroacoustics application. 

4) At high frequency ( 17 kHz 1. the sparse equation solver requires 
low memory, but requires significant speed-up before optimization 
studies (either of the duct geometry or liner material properties) are 
practical. This research supports a recommendation. Therefore, that 
a parallel version of the sparse solver be developed. The CPU time 
and RAM required by assembly algorithms 2 and 3 are two orders of 
magnitude smaller than that required by the sparse equaiion solver. 
These algorithms can. therefore, be conveniently incorporated into 
a substructunng lor domain decomposition) formulation (provided 
that each substructure is handled by different processors) to take 
advantage of parallel computation to further reduce CPU time and 
RAM. ” 
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Ins text is a practical guide to 
. buiuimg Kalman filters and 
show s how the filtering equations 
can be applied to real-life problems. 
Numerous examples are presented in 
detail show ing the mans wavs in 
which Kaiman filters can be 
designed Computer code written in 
FORTRAN, MATLAB, and True 
BASIC accompanies all of the exam- 
ples so that the interested reader can 
verify concepts and explore issues 
beyond the scope of the text. 
Sometimes mistakes are introduced 
intentionally to the initial filter 
designs to show the reader what hap- 
pens when the filter is not working 



properly. The text spends a great 
deal of time setting up a problem 
before the Kalman filter is actually 
formulated to give the reader an 
intuitive feel for the problem being 
addressed. Real problems are sel- 
dom presented in the form of dif- 
ferential equations and they usually 
don't have unique solutions. 


Therefore, the authors illustrate several 
different filtering approaches for tackling 
a problem Readers will gain experience in 
software and performance tradeoffs for 
determining the best filtering approach 
for the application at hand. 
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